Abstract

We show the closed-form solution to the maximization of $\mathrm{tr}(A^T R)$,
where $A$ is given and $R$ is an unknown rotation matrix. This problem occurs
in many computer vision tasks involving optimal rotation matrix estimation.
The solution has been repeatedly reinvented in different fields as part of
specific problems. We summarize the historical evolution of the problem and
present the general proof of the solution. We contribute to the proof by
considering the degenerate cases of $A$ and discussing the uniqueness of $R$.

1 Introduction 

Many computer vision problems that require estimation of the optimal rotation
matrix reduce to the maximization of $\mathrm{tr}(A^T R)$ (see footnote 1) for a
given matrix $A$:

$$\max_R \ \mathrm{tr}(A^T R), \quad \text{s.t. } R^T R = I,\ \det(R) = 1. \tag{1}$$

For instance, to estimate the closest rotation matrix R to the given matrix A, 
we can minimize: 

$$\min_R \|R - A\|_F^2 = \mathrm{tr}\big((R - A)^T (R - A)\big) = \mathrm{tr}(R^T R + A^T A) - 2\,\mathrm{tr}(A^T R) = \mathrm{tr}(I + A^T A) - 2\,\mathrm{tr}(A^T R) = -2\,\mathrm{tr}(A^T R) + \text{const}, \tag{2}$$

which is equivalent to the problem in Eq. (1). Historically, the matrix $R$ was
first constrained to be only orthogonal ($\det(R) = \pm 1$), which includes
rotation and flip. A brief list of the optimization problems that simplify to
the maximization of $\mathrm{tr}(A^T R)$ includes:

• $\min \|R - A\|_F^2$: the closest orthogonal approximation problem [1] [5],

• $\min \sum_i \|x_i - R y_i\|^2 = \|X - RY\|_F^2$: orthogonal Procrustes problem [US],
where $X = (x_1, \dots, x_N)$, $Y = (y_1, \dots, y_N)$ are matrices whose columns
are formed from the point position vectors,

• $\min \sum_i \|x_i - (sR y_i + t)\|^2 = \|X - (sRY + T)\|_F^2$: absolute orientation
problem (generalized Procrustes problem) [5], where $s$ is a scaling constant
and $t$, $T$ are the translation vector and matrix respectively,

• $\max \mathrm{tr}(R^T A)$: Scott and Longuet-Higgins [7] correspondence estimation,
where $A$ is a proximity matrix.

Footnote 1. The matrix trace, $\mathrm{tr}(\cdot)$, is the sum of the diagonal
elements of a matrix. $\mathrm{tr}(A^T R)$ is also the Frobenius inner product,
that is, the sum of element-wise products of the matrices $A$ and $R$.
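
The identity behind Eq. (2), namely that $\|R - A\|_F^2$ differs from
$-2\,\mathrm{tr}(A^T R)$ only by a constant independent of $R$, can be checked
numerically. The following is a minimal sketch in Python with NumPy (our choice
of tooling, not something the text prescribes); the helper `random_rotation` is
a hypothetical name introduced only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))

def random_rotation(rng, d=3):
    """Random rotation from a QR factorization (illustrative helper)."""
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1.0  # flip one column so that det(Q) = +1
    return Q

# tr(I + A^T A) does not depend on R, so it plays the role of "const".
const = np.trace(np.eye(3) + A.T @ A)

for _ in range(3):
    R = random_rotation(rng)
    lhs = np.linalg.norm(R - A, "fro") ** 2       # ||R - A||_F^2
    rhs = -2.0 * np.trace(A.T @ R) + const        # right-hand side of Eq. (2)
    print(lhs, rhs)
```

The two printed numbers should agree for every rotation, which is why minimizing
the distance and maximizing the trace select the same $R$.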

2 The Lemma 

Lemma 1. Let $R_{D \times D}$ be an unknown rotation matrix and $A_{D \times D}$ be
a known real square matrix. Let $U S V^T$ be a Singular Value Decomposition (SVD)
of $A$, where $U U^T = V V^T = I$ and $S = \mathrm{d}(s_i)$ is a diagonal matrix
with $s_1 \ge s_2 \ge \dots \ge s_D \ge 0$. Then the optimal rotation matrix $R$
that maximizes $\mathrm{tr}(A^T R)$ is

$$R = U C V^T, \quad \text{where } C = \mathrm{d}(1, 1, \dots, 1, \det(U V^T)). \tag{3}$$

The matrix $R$ is unique for any $A$, except for two cases:

1. $\mathrm{rank}(A) < D - 1$,

2. $\det(A) < 0$ and the smallest singular value, $s_D$, is not distinct.
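
As an illustration of the lemma, the closed-form solution can be sketched in a
few lines of Python with NumPy. This is a minimal sketch under our own
assumptions: the function name `optimal_rotation` is hypothetical, and the
comparison against random rotations is only a sanity check, not a proof.

```python
import numpy as np

def optimal_rotation(A):
    """Return R = U C V^T with C = d(1, ..., 1, det(U V^T))."""
    A = np.asarray(A, dtype=float)
    U, s, Vt = np.linalg.svd(A)  # A = U @ diag(s) @ Vt, s in descending order
    C = np.eye(A.shape[0])
    C[-1, -1] = 1.0 if np.linalg.det(U @ Vt) > 0 else -1.0
    return U @ C @ Vt

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
R = optimal_rotation(A)
best = np.trace(A.T @ R)

# Sanity check: no randomly drawn rotation should give a larger trace.
others = []
for _ in range(200):
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1.0  # make Q a proper rotation
    others.append(np.trace(A.T @ Q))

print(best, max(others))
```

The sign correction in the last diagonal entry of `C` is what distinguishes the
rotation-constrained solution from the purely orthogonal one.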

3 History of the problem 

The lemma has been reinvented repeatedly in various formulations in various
fields. Historically, the matrix $R$ was constrained to be only orthogonal. Here,
we try to summarize the historical flow of the problems and their solutions
that include the lemma.

In 1952, Green showed the solution to the orthogonal Procrustes problem
in the special case of a full-rank positive definite $A$, where $R$ is orthogonal.
In 1966, Schönemann [3] generalized Green's solution to an arbitrary $A$
and discussed the uniqueness of $R$. In 1981, Hanson and Norris [4] presented
the solution for a strictly rotational matrix $R$. Unfortunately, this work has
not received widespread attention.

In the context of the closest orthogonal approximation problem, a similar
solution was independently found in 1955 by Fan and Hoffman using polar
decomposition [1] [2].
In 1987, Arun et al. [5] presented the solution to the absolute orientation
problem and re-derived the lemma for orthogonal $R$, presumably unaware of the
earlier works. In the same year, a solution similar to Arun's was
independently obtained by Horn et al. [5].

In 1991, based on Arun's work, Umeyama [8] presented the proof for the
optimal strictly rotational matrix, once again unaware of the works of Hanson
and Norris and of Schönemann. As we shall show, Umeyama did not consider
all possible solutions, specifically for the degenerate cases of $A$, which makes
his proof slightly incomplete.

Here, we prove the lemma in the general case, mainly following Umeyama's
work [8]. In particular, we shall also consider the degenerate cases where $A$
has non-distinct singular values, which were only briefly mentioned by Hanson
and Norris [4] but otherwise, to the best of our knowledge, never considered
for the estimation of the optimal rotation matrix $R$.



4 Proof of the Lemma 

We convert the constrained optimization problem into an unconstrained one using
Lagrange multipliers. Define an objective function $f$ to be minimized as

$$\min_R f(R) = -\mathrm{tr}(A^T R) + \mathrm{tr}\big((R^T R - I)\Lambda\big) + \lambda(\det(R) - 1), \tag{4}$$

where $\Lambda$ is a symmetric matrix of unknown Lagrange multipliers and
$\lambda$ is another unknown Lagrange multiplier. Equating to zero the partial
derivatives of $f$ with respect to $R$, we obtain the following system of
equations:

$$\frac{\partial f}{\partial R} = -A + R\Lambda + \lambda R = RB - A = 0, \tag{5}$$

where $B$ is symmetric by construction: $B = \Lambda + \lambda I$. Thus we need
to solve a linear system of equations:

$$A = RB, \quad \text{s.t. } R^T R = I,\ \det(R) = 1. \tag{6}$$

Multiplying Eq. (6) from the left by its transpose, and using $R^T R = I$ and
the symmetry of $B$, we obtain:

$$A^T A = B^T R^T R B = B^2. \tag{7}$$

The matrix $A^T A$ is guaranteed to be symmetric and positive definite (or
positive semi-definite if $A$ is singular), and we can decompose it using the
spectral decomposition:

$$B^2 = A^T A = V S^2 V^T, \tag{8}$$

where $S^2$ is a real non-negative diagonal matrix of the eigenvalues of $A^T A$
as well as of $B^2$, so that $s_1^2 \ge s_2^2 \ge \dots \ge s_D^2 \ge 0$. Also note
that the matrix $S$ is the real non-negative diagonal matrix of the singular
values of $A$.
Clearly, the matrices $B$ and $B^2$ are both symmetric and commute,
$B B^2 = B^2 B$; hence they share the same eigenvectors, but only when $B^2$ is
not degenerative (see footnote 2). Thus the matrix $B$ is of the form

$$B = V M V^T, \tag{9}$$

where $M$ is a real diagonal matrix with the eigenvalues of $B$, which must be
of the form $M = \mathrm{d}(\pm s_1, \pm s_2, \dots, \pm s_D)$.

In the degenerate (but still valid) case of $A$, $S$, as well as $S^2$, has
repeated values, and the matrix $M$ does not have to be diagonal. $M$ has a
symmetric block-diagonal structure with the number of blocks equal to the number
of distinct values in $S^2$. To see this, note that

$$S^2 M = V^T B^2 V V^T B V = V^T B^2 B V = V^T B B^2 V = M S^2, \tag{10}$$
$$S^2 M - M S^2 = 0 \ \Rightarrow \tag{11}$$
$$(s_i^2 - s_j^2)\, m_{ij} = 0, \quad \forall i, j, \tag{12}$$

where $s_i^2$ and $m_{ij}$ are the elements of $S^2$ and $M$ respectively. If all
the $s_i^2$ are distinct, then we conclude that $m_{ij} = 0$, $\forall i \ne j$,
and $M$ is diagonal. If not all $s_i^2$ are distinct, then $m_{ij} = 0$ is
required only where $s_i^2 \ne s_j^2$, and thus $M$ is block-diagonal, formed
from square symmetric blocks corresponding to the repeated values $s_i$.
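
A small numerical example may make the block-diagonal structure concrete. The
sketch below (Python with NumPy, our choice) uses hypothetical singular values
(2, 2, 1) and builds a symmetric, non-diagonal $M$ whose square still equals
the squared singular value matrix; the angle `t` and the sign pattern in `N`
are arbitrary illustrative choices.

```python
import numpy as np

# Hypothetical singular values (2, 2, 1): the largest value is repeated.
S = np.diag([2.0, 2.0, 1.0])
S2 = S @ S

# Orthogonal Q that mixes only the repeated block (rotation in the 1-2 plane).
t = 0.3
Q = np.array([[np.cos(t), -np.sin(t), 0.0],
              [np.sin(t),  np.cos(t), 0.0],
              [0.0,        0.0,       1.0]])

# A sign choice inside the repeated block, then M = Q N Q^T.
N = np.diag([2.0, -2.0, 1.0])
M = Q @ N @ Q.T

print(np.round(M, 3))                 # non-zero off-diagonals inside the block
print(np.allclose(M @ M, S2))         # M^2 equals S^2
print(np.allclose(S2 @ M, M @ S2))    # M commutes with S^2, as in Eq. (10)
```

The off-diagonal entries of `M` are non-zero only between indices that share
the same squared singular value, in agreement with Eq. (12).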

Now, we consider the following cases separately: $A$ is non-singular and
non-degenerative (all singular values are distinct), $A$ is non-singular and
degenerative, and $A$ is singular.

Non-degenerative case of $A$: $M$ is diagonal. Substituting $M$ into
Eq. (5) and then into the objective function, we obtain:

$$\mathrm{tr}(A^T R) = \mathrm{tr}(B^T R^T R) = \mathrm{tr}(B) = \mathrm{tr}(V M V^T) = \mathrm{tr}(M). \tag{13}$$

Taking into account that $\det(R) = 1$, from Eq. (6) we see that

$$\det(A) = \det(R)\det(B) = \det(B) = \det(V)\det(M)\det(V^T) = \det(M), \tag{14}$$

hence $\det(M)$ must have the same sign as $\det(A)$. Clearly, the matrix $M$
that maximizes its trace is

$$M = \mathrm{d}(s_1, s_2, \dots, s_D), \quad \text{if } \det(A) > 0, \tag{15}$$

$$M = \mathrm{d}(s_1, s_2, \dots, -s_D), \quad \text{if } \det(A) < 0, \tag{16}$$

and the value of the objective function at the optimum is

$$\mathrm{tr}(A^T R) = \mathrm{tr}(M) = s_1 + s_2 + \dots + s_{D-1} \pm s_D, \tag{17}$$

where the last sign depends on the determinant of $A$.
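
The optimal value in Eq. (17) can be compared with the trace obtained from the
closed-form rotation of the lemma. The following Python/NumPy sketch is only an
illustration under our own assumptions; `optimal_rotation` and
`predicted_optimum` are hypothetical helper names.

```python
import numpy as np

def optimal_rotation(A):
    """R = U C V^T with C = d(1, ..., 1, det(U V^T)), as in the lemma."""
    U, s, Vt = np.linalg.svd(A)
    C = np.eye(A.shape[0])
    C[-1, -1] = 1.0 if np.linalg.det(U @ Vt) > 0 else -1.0
    return U @ C @ Vt

def predicted_optimum(A):
    """s_1 + ... + s_{D-1} +/- s_D, with the sign of det(A), as in Eq. (17)."""
    s = np.linalg.svd(A, compute_uv=False)
    sign = 1.0 if np.linalg.det(A) > 0 else -1.0
    return s[:-1].sum() + sign * s[-1]

rng = np.random.default_rng(0)
A_pos = rng.standard_normal((3, 3))
if np.linalg.det(A_pos) < 0:
    A_pos[:, 0] *= -1.0      # force det(A_pos) > 0
A_neg = A_pos.copy()
A_neg[:, 0] *= -1.0          # flipping one column flips the sign of det

results = []
for A in (A_pos, A_neg):
    R = optimal_rotation(A)
    results.append((np.trace(A.T @ R), predicted_optimum(A)))
    print(results[-1])
```

Both matrices have the same singular values, so the two optima differ by
exactly twice the smallest singular value.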



Footnote 2. Here, by a degenerative matrix we mean a matrix with non-distinct
(repeated) singular values. Note that a matrix can be non-singular but still
degenerative.
Now, we can find the optimal rotation matrix $R$ from Eq. (5):

$$A = RB, \tag{18}$$
$$U S V^T = R V M V^T, \tag{19}$$
$$U S = R V M. \tag{20}$$

If $A$ is non-singular ($\mathrm{rank}(A) = D$), then $M$ is invertible, and the
optimal $R$ is

$$R = U S M^{-1} V^T = U C V^T, \quad \text{where } C = \mathrm{d}(1, 1, \dots, 1, \det(U V^T)), \tag{21}$$

where $\det(U V^T) = \det(U)\det(V^T) = \mathrm{sign}(\det(A)) = \pm 1$, depending
on the sign of $\det(A)$.

Degenerative case of $A$: $M$ is symmetric block-diagonal. Since $M$ is
symmetric, it can be diagonalized using the spectral decomposition
$M = Q N Q^T$, where $Q$ is orthogonal and also block-diagonal with the same
block structure as $M$. The matrix $N$ is real and diagonal.

$$B^2 = V S^2 V^T = V M^2 V^T \ \Rightarrow \tag{22}$$
$$S^2 = M^2 = Q N^2 Q^T \ \Rightarrow \tag{23}$$
$$N^2 = Q^T S^2 Q = S^2. \tag{24}$$

The last equality holds because $S^2$ has multiples of the identity along the
diagonal, which correspond to the repeated values. The matrix $Q$ is orthogonal
and block-diagonal, where each block has a corresponding multiple of the
identity in $S^2$. Direct matrix multiplication shows that $Q^T S^2 Q = S^2$.

Thus $N = \mathrm{d}(\pm s_1, \pm s_2, \dots, \pm s_D)$, and the value of the
objective function at the optimum is
$\mathrm{tr}(M) = \mathrm{tr}(Q N Q^T) = \mathrm{tr}(N)$. Taking into account the
sign of the determinant of $A$, we conclude that the $N$ that maximizes its
trace is of the form

$$N = \mathrm{d}(s_1, s_2, \dots, s_D), \quad \text{if } \det(A) > 0, \tag{25}$$

$$N = \mathrm{d}(s_1, s_2, \dots, -s_D), \quad \text{if } \det(A) < 0. \tag{26}$$

The objective function at the optimum is
$\mathrm{tr}(N) = s_1 + s_2 + \dots + s_{D-1} \pm s_D$, which is exactly the same
value as in Eq. (17), when $M$ is diagonal. Thus, in the degenerate case there
is a set of block-diagonal matrices $M$ that give the same objective function
value as the diagonal $M$.

Now, let us consider the form of $M$ for the optimal choices of $N$. When
$\det(A) > 0$, from Eq. (25) we have

$$M = Q N Q^T = Q S Q^T = S = N, \tag{27}$$

where the orthogonal matrix $Q$ cancels against the corresponding multiples of
the identity in $S$. Thus, if $\det(A) > 0$, the optimal $M$ is unique. In the
case when $\det(A) < 0$ (Eq. (26)), the equality $Q N Q^T = N$ holds only if the
smallest element $s_D$ is not repeated. If $s_D$ happens to be repeated and
$\det(A) < 0$, then $M$ is not unique, and there is a set of optimal solutions
$M = Q N Q^T$, which is unavoidable. However, even in this case, it is always
possible to choose $M$ to be diagonal (Eq. (16)). Similar to the
non-degenerative case, if $A$ is non-singular, the optimal $R$ is found in the
same way as in Eq. (21):

$$R = U S M^{-1} V^T = U C V^T, \quad \text{where } C = \mathrm{d}(1, 1, \dots, 1, \det(U V^T)). \tag{28}$$

However, consider the uniqueness of the SVD of $A$ and the computation of $R$ in
the degenerative case. We know that if the singular values of $A$ are not
distinct, then the SVD of $A$ is not unique. In particular, any normalized
linear combination of singular vectors corresponding to the same singular value
is also a singular vector. Consider the SVD of a degenerative $A$:

$$A = U S V^T, \tag{29}$$
$$U = A V S^{-1}. \tag{30}$$

According to Eq. (28),

$$R = U C V^T = A V S^{-1} C V^T. \tag{31}$$

If $\det(A) > 0$, then $C = I$ and $R = A V S^{-1} V^T$, which means that $R$ is
unique even though $V$ is not. This is because $S^{-1}$ has the same pattern of
repeated elements as the singular values in $S$, so $V S^{-1} V^T$ is a unique
matrix, and thus $R$ is uniquely determined.

If $\det(A) < 0$ and the smallest singular value is not distinct, then the
rotation matrix $R = A V S^{-1} C V^T$ is not unique, because different SVDs of
$A$ produce different $V$, and the matrix $V S^{-1} C V^T$ is not uniquely
determined. Furthermore, even if the singular values of $A$ are distinct but
poorly isolated (close to each other), a small perturbation of $A$ can alter
the singular vectors significantly [10], and thus $R$ changes significantly as
well. This means that when $\det(A) < 0$ and $A$ is degenerative or close to
degenerative, the matrix $R$ is extremely sensitive to any changes in $A$. In
particular, in this case round-off errors present in the computation of $A$ and
of the SVD of $A$ can produce a significantly different $R$.
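
The non-uniqueness for negative determinant with a repeated smallest singular
value can be reproduced with a hand-built example. The sketch below (Python
with NumPy, our choice) uses a hypothetical diagonal matrix with singular values
(2, 1, 1) and two valid SVDs that differ by a rotation `G` of the repeated
singular vectors; the angle `t` is arbitrary.

```python
import numpy as np

# Hypothetical example: det(A) < 0 and the smallest singular value is repeated.
A = np.diag([2.0, 1.0, -1.0])
S = np.diag([2.0, 1.0, 1.0])
C = np.diag([1.0, 1.0, -1.0])   # last entry is det(U V^T) = -1 here

# One valid SVD: A = U1 S V1^T.
U1 = np.diag([1.0, 1.0, -1.0])
V1 = np.eye(3)

# A second valid SVD: rotate the singular vectors of the repeated value.
t = 0.4
G = np.array([[1.0, 0.0,        0.0],
              [0.0, np.cos(t), -np.sin(t)],
              [0.0, np.sin(t),  np.cos(t)]])
U2, V2 = U1 @ G, V1 @ G

R1 = U1 @ C @ V1.T
R2 = U2 @ C @ V2.T

for U, V, R in ((U1, V1, R1), (U2, V2, R2)):
    assert np.allclose(U @ S @ V.T, A)              # both factorizations give A
    print(np.linalg.det(R), np.trace(A.T @ R))      # same det and same trace
print(np.allclose(R1, R2))                           # but different rotations
```

Both rotations attain the same objective value, so neither can be preferred
without additional information about the problem.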

We note that Umeyama [8], in his derivation of the lemma, did not consider
the case when $A$ is degenerative.

Singular case of $A$: If $A$ is singular and $\mathrm{rank}(A) = D - 1$ (only a
single singular value is zero), then $M = S = \mathrm{d}(s_1, s_2, \dots, 0)$ and

$$U S = R V S. \tag{32}$$

If we define an orthogonal matrix $K = U^T R V$, then

$$K S = S. \tag{33}$$

Since the column vectors of $K$ are orthonormal, they are of the form

$$k_i = (0, 0, \dots, \underset{i}{1}, \dots, 0)^T, \quad \text{for } 1 \le i \le D - 1, \tag{34}$$

$$k_D = (0, 0, \dots, \pm 1)^T. \tag{35}$$
Taking into account the constraint on the determinant of $R$, we have

$$\det(K) = \det(U^T)\det(R)\det(V) = \det(U V^T). \tag{36}$$

Thus, we obtain:

$$R = U K V^T = U C V^T, \quad \text{where } C = \mathrm{d}(1, 1, \dots, 1, \det(U V^T)). \tag{37}$$

Finally, if $A$ is singular and $\mathrm{rank}(A) < D - 1$ ($A$ has multiple zero
singular values), then the matrix $K$ is not uniquely determined. Precisely, one
can choose the last column vectors of $K$ (as many as there are zero singular
values) arbitrarily, as long as they are orthonormal and
$\det(K) = \det(U V^T)$. This gives a set of equivalent solutions for $R$.
Additional information or constraints are required to make $R$ unique. We note
that it is always possible to choose $K$ according to Eqs. (34) and (35) and
find $R$ from Eq. (37).
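
The rank-deficient case can also be illustrated numerically. In the sketch below
(Python with NumPy, our choice), the matrix is a hypothetical rank-one example
in three dimensions, and `rotation_about_x` is an illustrative helper; every
rotation about the first axis attains the same trace as the solution built from
the SVD.

```python
import numpy as np

# Hypothetical example: rank(A) = 1 < D - 1 for D = 3.
A = np.diag([1.0, 0.0, 0.0])

def rotation_about_x(t):
    """Rotation by angle t about the first coordinate axis."""
    return np.array([[1.0, 0.0,        0.0],
                     [0.0, np.cos(t), -np.sin(t)],
                     [0.0, np.sin(t),  np.cos(t)]])

# Solution of the lemma: R = U C V^T with C = d(1, 1, det(U V^T)).
U, s, Vt = np.linalg.svd(A)
C = np.eye(3)
C[-1, -1] = 1.0 if np.linalg.det(U @ Vt) > 0 else -1.0
R_lemma = U @ C @ Vt

# Objective values for the lemma's solution and for other rotations.
values = [np.trace(A.T @ R_lemma)]
values += [np.trace(A.T @ rotation_about_x(t)) for t in (0.5, 1.0, 2.0)]
print(values)
```

All listed rotations are equally optimal here, which is the set of equivalent
solutions described above.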

Thus, we have considered all cases of $A$, which concludes the proof of the lemma.

5 Discussion and conclusion 

The lemma is of general interest and is useful in many computer vision and
machine learning problems that can be simplified to the maximization of
$\mathrm{tr}(A^T R)$. The lemma gives the optimal solution for the rotation
matrix $R$. In most cases $R$ is uniquely determined. In the case when
$\mathrm{rank}(A) < D - 1$, and in the degenerate case when the smallest
singular value is not distinct and $\det(A) < 0$, the presented solution for $R$
is still a global optimum of the function, but it is not unique. Also, we have
shown that in these degenerative cases $R$ is extremely sensitive to round-off
errors in $A$. In the cases when $R$ is not unique, the solution given by
Eq. (3) should be further justified by the particular problem.

If we relax the constraint for $R$ to be strictly rotational and allow it to be
any orthogonal matrix (which allows for rotation and flip), then the derivation
simplifies to the solution $R = U V^T$, which was established by Schönemann [3],
and it is unique for all non-singular $A$. The lemma can be applied to problems
of arbitrary dimensions.
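
The difference between the relaxed (orthogonal) and the strictly rotational
solutions can be seen on a matrix with negative determinant. The following
Python/NumPy sketch is illustrative only; the function names
`optimal_orthogonal` and `optimal_rotation` and the test matrix are our own
assumptions.

```python
import numpy as np

def optimal_orthogonal(A):
    """R = U V^T: the maximizer when R may be any orthogonal matrix."""
    U, _, Vt = np.linalg.svd(A)
    return U @ Vt

def optimal_rotation(A):
    """R = U C V^T with C = d(1, ..., 1, det(U V^T)), as in Eq. (3)."""
    U, _, Vt = np.linalg.svd(A)
    C = np.eye(A.shape[0])
    C[-1, -1] = 1.0 if np.linalg.det(U @ Vt) > 0 else -1.0
    return U @ C @ Vt

# Hypothetical matrix with det(A) < 0 and singular values (3, 2, 1).
A = np.diag([3.0, 2.0, -1.0])

R_orth = optimal_orthogonal(A)
R_rot = optimal_rotation(A)

print(np.linalg.det(R_orth), np.trace(A.T @ R_orth))  # a flip; trace is s1+s2+s3
print(np.linalg.det(R_rot), np.trace(A.T @ R_rot))    # a rotation; trace is s1+s2-s3
```

When the determinant of the input is positive the two solutions coincide, so the
correction matters only for inputs that would otherwise produce a flip.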